Abstract

Background: Heat shock transcriptional factors (Hsfs) play a crucial role in plant responses to biotic and abiotic 
stress conditions and in plant growth and development. Apple (Malus domestica Borkh.) is an economically
important fruit tree whose genome has been fully sequenced. So far, no detailed characterization of the Hsf gene 
family is available for this crop plant. 

Results: A genome-wide analysis was carried out in Malus domestica to identify heat shock transcriptional factor
(Hsf) genes, named MdHsfs. Twenty-five MdHsfs were identified and classified into three main groups (classes A, B and
C) according to their structural characteristics and to a phylogenetic comparison with Arabidopsis thaliana and
Populus trichocarpa. Chromosomal duplications were analyzed, and segmental duplications were shown to have
occurred more frequently than tandem duplications in the expansion of Hsf genes in the apple genome. Furthermore,
MdHsf transcripts were detected in several apple organs, and expression changes were observed by quantitative
real-time PCR (qRT-PCR) analysis in developing flowers and fruits as well as in leaves harvested from trees grown
in the field and exposed to naturally increased temperatures.

Conclusions: The apple genome comprises 25 full-length Hsf genes. The data obtained from this investigation
contribute to a better understanding of the complexity of the Hsf gene family in apple, and provide the basis for 
further studies to dissect Hsf function during development as well as in response to environmental stimuli. 
Background

Trees are sessile organisms with long lifespans that regularly
experience climatic fluctuations in their native environment.
Therefore, survival and reproduction are
dependent upon an array of protective mechanisms that 
involve the activation of a wide range of transcriptional 
factors, and their products are considered to play a cen- 
tral role in response to extreme physiological conditions. 
There is evidence that members of the heat shock tran- 
scriptional factor (Hsf) family are important regulators 
in sensing and signaling of different environmental stres- 
ses [1]. Similarly to many other transcription factors, the 
Hsfs have a modular structure containing signature 
domains structurally and functionally conserved
throughout the eukaryotic kingdom. A common core 
structure in the Hsfs is composed of an N-terminal 
DNA binding domain (DBD), characterized by a central 
helix-turn-helix motif that specifically binds to the heat 
shock elements (HSE) in the target promoters, and an 
adjacent bipartite oligomerization domain (HR-A/B) 
composed of hydrophobic heptad repeats [2]. Hsf tri- 
merization via the formation of a triple stranded alpha- 
helical coiled-coil is a prerequisite for high affinity DNA 
binding and, subsequently, for transcriptional activity. 
Other Hsf functional modules include clusters of basic 
amino acids essential for nuclear import (NLS), leucine- 
rich export sequences important for nuclear export 
(NES), and a less conserved C-terminal activator domain 
(CTAD) rich in aromatic, hydrophobic and acidic amino 
acids, the so-called AHA motifs [2,3]. 
In contrast to Saccharomyces cerevisiae, Caenorhabditis
elegans and Drosophila melanogaster, which each possess
only a single Hsf gene, plant genomes contain
large numbers of Hsf genes, up to 52 [1,4,5]. Based on
structural characteristics and phylogenetic comparisons, 
plant Hsfs are grouped into classes A, B and C [2,6]. All 
class A and C Hsfs have an extended HR-A/B region 
due to the insertion of 21 (Class A) or seven (class C) 
amino acid residues between A and B parts of the HR- 
A/B region. In contrast, in class B Hsfs, the HR-A/B
region does not contain insertions. In addition, sequence
comparisons and structural analyses indicate that
the combination of an AHA motif with an adjacent nuclear
export signal (NES) represents a characteristic signature
domain for many plant class A Hsfs [6,7].
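
The insert-length rule described above can be written down as a small lookup. This is only an illustrative sketch: the function name `classify_hsf` is hypothetical, and it assumes the length of the insertion between the A and B parts of the HR-A/B region has already been measured from a sequence alignment.

```python
# Illustrative sketch of the class rule stated in the text.
# Assumption: insert_length is the number of residues inserted between
# the A and B parts of the HR-A/B region, measured beforehand.

# 21 residues -> class A, 7 residues -> class C, no insertion -> class B
INSERT_LENGTH_TO_CLASS = {21: "A", 7: "C", 0: "B"}


def classify_hsf(insert_length: int) -> str:
    """Return the Hsf class implied by the HR-A/B insert length."""
    if insert_length not in INSERT_LENGTH_TO_CLASS:
        raise ValueError(f"unexpected HR-A/B insert length: {insert_length}")
    return INSERT_LENGTH_TO_CLASS[insert_length]
```

In practice the classification also relies on phylogenetic comparison, so a rule like this would be a first pass rather than a final assignment.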

After the release of the whole genomic sequences of
several plant organisms, including rice (Oryza sativa),
maize (Zea mays), poplar (Populus trichocarpa), medicago
(Medicago truncatula) and tomato (Solanum lycopersicon),
the Hsf family was analyzed extensively, both to
place each member in an organized nomenclature system
and to provide maps of their expression [7-10].

Recently, the full genome sequence of the domesti- 
cated apple (Malus domestica Borkh) has been published 
[11]. This provides a useful genomic tool to study this 
economically important fruit crop. As transcriptional 
factors, Hsfs are involved in different aspects of plant life 
including tolerance to biotic/abiotic stresses and devel- 
opmental processes [12-14]. Therefore, this gene family 
represents an important group of transcriptional factors 
to investigate and to characterize. Genome scale analyses 
of the transcriptional response during development and 
to environmental stimuli require a precise and complete 
annotation of genes in order to provide reliable and ex- 
haustive data. Therefore, the aim of this study was to an- 
notate the full length Hsf genes in apple, and to analyze 
their expression profiles by quantitative real time PCR 
(qRT-PCR) in different organs/tissues from plants grown 
in the field and exposed to natural environmental condi- 
tions. The results of this work provide a foundation to 
better understand the functional structure and genomic 
organization of the Hsf gene family in apple, and will be 
undoubtedly useful in future gene cloning and functional 
studies. 

Results 

Identification, classification and duplication of Hsf genes 
in the Malus domestica genome 

CDS sequences corresponding to putative Hsf genes from
Malus domestica (MdHsfs) were searched in the Apple
Genome v1.0 [15]. As a result, 36 genes encoding putative
MdHsf proteins were identified. All candidate MdHsf
proteins were surveyed, and those with incomplete sequences
for the DBD or for the remaining functional domains
were removed. This resulted in the selection of 25
complete sequences. These MdHsf genes were distributed
on 12 of the 17 apple chromosomes, with the largest number,
six Hsf genes, detected on chromosome 15 (Table 1).
According to the multiple sequence alignment of the DBD
and HR-A/B region, 16 genes were assigned to class A,
seven to class B and two to class C.

Gene duplication events have been indicated as an important
mechanism in the evolution of plant genomes
[16]. Therefore, duplications of MdHsfs were also analyzed.
As shown in Figure 1, a total of 12 duplicated
gene pairs of MdHsfs were identified, including 11 segmental
duplication events between chromosomes (e.g.
MdHsfC1a and MdHsfC1b) as well as one tandem duplication
event within the same chromosome (MdHsfA3c and
MdHsfA3b). MdHsfA3c was the only Hsf involved in both
types of duplication event, as it was duplicated
with MdHsfA3b in tandem on chromosome 14
and also segmentally duplicated with MdHsfA3a on
chromosome 12.
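
The distinction drawn above can be illustrated with a short sketch. It uses only the chromosome assignments from Table 1 for the pairs named in the text, and it makes a simplifying assumption: a pair on the same chromosome is labeled tandem and a pair on different chromosomes is labeled segmental. A real analysis would also consider physical distance and synteny; the function name `duplication_type` is hypothetical.

```python
# Chromosome assignments for the genes named in the text (from Table 1).
CHROMOSOME = {
    "MdHsfA3a": "Chr12",
    "MdHsfA3b": "Chr14",
    "MdHsfA3c": "Chr14",
    "MdHsfC1a": "Chr2",
    "MdHsfC1b": "Chr15",
}


def duplication_type(gene_a: str, gene_b: str) -> str:
    """Label a duplicated pair by chromosome (simplifying assumption)."""
    same_chromosome = CHROMOSOME[gene_a] == CHROMOSOME[gene_b]
    return "tandem" if same_chromosome else "segmental"


pairs = [
    ("MdHsfC1a", "MdHsfC1b"),
    ("MdHsfA3c", "MdHsfA3b"),
    ("MdHsfA3c", "MdHsfA3a"),
]
labels = {pair: duplication_type(*pair) for pair in pairs}
```

With these inputs, MdHsfA3c appears once in a tandem pair and once in a segmental pair, matching the description in the text.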



Table 1 List of Hsf genes in the Malus domestica genome

Gene name   Chromosomal localization   Size (aa)   MW (kDa)   pI
MdHsfA1a    Chr6   MDP0000517644       540         59.37      4.76
MdHsfA1b    Chr10  MDP0000156337       546         61.14      4.96
MdHsfA1c    Chr13  MDP0000232623       550         60.07      6.01
MdHsfA1d    Chr16  MDP0000259645       580         64.34      5.04
MdHsfA2a    Chr8   MDP0000489886       380         42.42      4.73
MdHsfA2b    Chr15  MDP0000243895       377         42.24      4.63
MdHsfA3a    Chr12  MDP0000131346       516         56.26      4.18
MdHsfA3b    Chr14  MDP0000606400       455         50.37      6.43
MdHsfA3c    Chr14  MDP0000174161       582         64.34      4.89
MdHsfA4a    Chr5   MDP0000155849       420         47.23      5.62
MdHsfA5a    Chr9   MDP0000301101       483         53.81      5.08
MdHsfA5b    Chr15  MDP0000613011       482         54.19      5.48
MdHsfA8a    ChrW   MDP0000191541       414         46.89      4.55
MdHsfA8b    Chr13  MDP0000172376       411         44.86      5.10
MdHsfA9a    Chr2   MDP0000194672       713         75.89      6.86
MdHsfA9b    Chr15  MDP0000319456       482         53.29      4.86
MdHsfB1a    Chr2   MDP0000527802       294         32.26      8.76
MdHsfB1b    Chr15  MDP0000578396       232         28.40      4.67
MdHsfB2a    Chr1   MDP0000155667       276         30.97      5.96
MdHsfB3a    Chr12  MDP0000622590       243         27.77      7.22
MdHsfB3b    Chr14  MDP0000202716       243         27.82      7.82
MdHsfB4a    Chr8   MDP0000209135       381         42.85      7.62
MdHsfB4b    Chr15  MDP0000129357       383         43.19      7.64
MdHsfC1a    Chr2   MDP0000230456       324         36.25      6.27
MdHsfC1b    Chr15  MDP0000320827       344         38.36      5.02

MW: molecular weight; pI: isoelectric point.
Figure 1 Localization and duplication of the Hsf genes in the apple genome. Circular visualization of the 25 Hsfs mapped on the different
chromosomes in the apple genome was obtained using the Circos software. The picture shows only the chromosomes containing MdHsf genes, and
the chromosome number is indicated on the inner side. Segmental duplications are joined by lines, while the tandem duplication of MdHsfA3b
and MdHsfA3c is indicated by an asterisk.




Analysis of conserved domains in the apple Hsf proteins 

Prediction of the typical signature domains present in
the MdHsf protein sequences was carried out by comparing
the identified apple Hsfs to homologous,
well-characterized proteins of model plants such as tomato
or Arabidopsis [2,6,7]. Table 2 lists five conserved
motifs that were identified by sequence alignment, and
their positions in the protein sequences. All the MdHsfs
showed the presence of the highly conserved DBD
in the N-terminal region, consisting of a three-helical
bundle (H1, H2 and H3) and a four-stranded
antiparallel β-sheet. The length of the DBD motif was
quite variable, with the smallest size observed for
MdHsfB1b. The presence of the coiled-coil structure
characteristic of leucine-zipper type protein interaction
domains, which is a property of the HR-A/B region, was
predicted in all MdHsf proteins by using the MARCOIL
tool. Furthermore, the majority of the MdHsfs
showed the presence of NES and NLS domains, which
were described to be essential for shuttling Hsfs between
nucleus and cytoplasm [7]. Additional sequence comparison
allowed the identification of AHA motifs in the
center of the C-terminal activation domains, as is
expected in the A-type Hsfs. By contrast, these motifs
were not identified in the B- and C-type MdHsfs.

A second approach was used to identify and verify
domain prediction in the MdHsf proteins, by using the
MEME motif search tool. Thirty corresponding consensus
motifs were detected (Figure 2; Table 3). The majority
of MdHsfs displayed motifs 1, 2, 3, 4 and 5, which
correspond to highly conserved regions including
the DBD and HR-A/B domains. In
addition, inspection of the motif distribution revealed
that some motifs were only present in specific classes
of the MdHsf family. For example, motif 10 was representative
of A-type Hsf members such as MdHsfA1a-A1d,
MdHsfA4a and MdHsfA8a, and it contained the signature
domain corresponding to the NES sequence. Similarly,
Table 2 Functional motifs of apple Hsfs

Gene name  DBD          HR-A/B       NLS                       NES                 AHA
MdHsfA1a   [illegible]  [illegible]  [illegible]               [illegible]         [illegible]
MdHsfA1b   17-152       174-240      [illegible]               [illegible]         [illegible]
MdHsfA1c   16-156       173-243      (268) NKKRRLPR            (533) MNHITEQMQ     AHA (486) DIFWEQFLTAS
MdHsfA1d   103-196      217-284      (307) NKKRRLKR            (563) MDNLTEKMG     AHA (516) DIEAFLKDWDD
MdHsfA2a   30-132       147-213      (228) KNRK-X7-RKRR        (368) LLDQMGYQ      AHA1 (318) ETIWEELWSD; AHA2 (360) DWGKDLQD
MdHsfA2b   38-131       145-212      (227) KNR-X6-RKRR         (365) LVDQMGYL      AHA1 (315) ETIWEELWSD; AHA2 (355) DWGEDLQD
MdHsfA3a   99-209       226-285      (297) KTRRKFVK            nd                  AHA1 (431) EDIWSMGFGV; AHA2 (450) ELWGNPVNY; AHA3 [illegible]; AHA4 [illegible]
MdHsfA3b   [illegible]  [illegible]  [illegible]               nd                  nd
MdHsfA3c   99-244       [illegible]  [illegible]               nd                  AHA1 [illegible]; AHA2 [illegible]; AHA3 (539) LDVWDIDPLQ; AHA4 (555) INKWPAHES
MdHsfA4a   10-103       123-190      (208) RKRRLPR             (407) LTEQMGHL      AHA1 (252) LTFWEDTIHD; AHA2 (356) DGFWEQFLTE
MdHsfA5a   12-105       116-183      (194) RK-X10-KKRR         (477) AETLTL        AHA (431) DVFWEQFLTE
MdHsfA5b   12-105       117-183      (194) RK-X10-KKRR         (477) AETLTL        AHA (431) DVFWEQFLTE
MdHsfA8a   18-111       129-199      (177) RNRLR               (389) TEQMGHL       AHA (308) DGAWEQFLLA
MdHsfA8b   18-111       127-196      (172) RLLRNR              nd                  AHA (306) DGAWEQLLLG
MdHsfA9a   139-239      241-308      (324) KR-X8-KRRR          (258) LKADQD        nd
MdHsfA9b   139-239      241-308      (324) KR-X8-KRRR          (258) LKADQD        nd
MdHsfB1a   6-99         142-191      (246) KGDEKMKGKK          nd                  nd
MdHsfB1b   2-35         78-127       (181) KGEEKMKGKK          (159) LDMEGG        nd
MdHsfB2a   22-115       154-197      (167) RLRK                nd                  nd
MdHsfB3a   19-112       149-194      (223) RKRKR               (208) PKLFGVRLE     nd
MdHsfB3b   22-116       149-194      (179) KRKCK; (223) RKRKR  (208) LKLFGVRLE     nd
MdHsfB4a   21-114       183-239      (325) KNTK-X9-KKR         (366) LEKDDLGLQLM   nd
MdHsfB4b   21-114       180-240      (327) KNTK-X9-KKR         (368) LEKDDLGLHLM   nd
MdHsfC1a   7-100        119-171      (195) KKRR                nd                  nd
MdHsfC1b   9-102        128-180      (204) KKRR                nd                  nd

Numbers in brackets indicate the position of the first amino acid present in the putative nuclear localization signal (NLS), nuclear export signal (NES) and activator
(AHA) motifs in the C-terminal domains. nd: no motifs detectable by sequence similarity searches.



motif 7, containing the AHA sequence, was detected in the
C-terminal parts of many MdHsf proteins belonging to
class A. Furthermore, eight A-type Hsf members,
namely MdHsfA1a-A1d, MdHsfA2a-b and MdHsfA5a-b, were
characterized by the presence of motif 13, which contained
the NLS domain. Interestingly, all class B Hsf members
exhibited motif 20, while MdHsfC1a and MdHsfC1b contained
motif 29 (Table 3).

Phylogenetic analysis of apple Hsf proteins 

To investigate the evolution of Hsfs, an unrooted phylogenetic
tree was generated by using the 25 Malus
domestica Hsfs, 28 Populus trichocarpa Hsfs (PtHsfs)
and 21 Arabidopsis thaliana Hsfs (AtHsfs). Populus and
Arabidopsis were chosen because their full genome
sequences have been released, and their Hsf members have been
well characterized [7,10]. Moreover, Populus, like apple, is a tree.



Figure 3 shows the result of this analysis. Hsfs of Malus
domestica, Arabidopsis thaliana and Populus trichocarpa
were clearly grouped into three different clades corresponding
to the main Hsf classes A, B and C. Within the A-type
clade, nine distinct sub-clades were resolved, seven of
which (A1, A2, A3, A4, A5, A8 and A9) comprised
apple Hsf sequences. The C-type Hsfs from the three plant
species also constituted one distinct clade, which appeared
more closely related to the Hsf A-group. Likewise,
the B-type Hsfs from the three plant species grouped in a
separate clade. Two of the five sub-clades, B3 and B2, were
paraphyletic. As expected, the duplicated Hsfs of Malus
domestica clustered together on the phylogenetic tree.

In silico expression analyses of MdHsf genes 

Tissue specific expression of MdHsfs was investigated by 
counting the number of ESTs per tissue from EST 
Figure 2 Motifs identified by MEME tools in apple Hsfs. Thirty motifs were identified and indicated by increasing number from 1 to 30. Their
distribution and visualization on the full-length protein sequences were performed by using the Expasy mydomain tools. The figure lists the
following combined p values:

Name       Combined p value
MdHsfA1a   0.00e+00
MdHsfA1b   0.00e+00
MdHsfA1c   0.00e+00
MdHsfA1d   0.00e+00
MdHsfA2a   9.41e-185
MdHsfA2b   3.76e-192
MdHsfA3a   3.37e-237
MdHsfA3b   8.20e-237
MdHsfA3c   0.00e+00
MdHsfA4a   4.25e-126
MdHsfA5a   6.61e-237
MdHsfA5b   1.11e-266
MdHsfA8a   2.88e-204
MdHsfA8b   2.16e-191
MdHsfA9a   6.88e-166
MdHsfA9b   3.22e-164
MdHsfB1a   2.00e-104
MdHsfB1b   3.79e-44
MdHsfB2a   2.87e-92
MdHsfB3a   1.81e-102
MdHsfB3b   3.02e-103
MdHsfB4a   1.47e-220
MdHsfB4b   1.59e-222
MdHsfC1a   4.70e-155
MdHsfC1b   4.04e-155



libraries [17]. This resulted in the assignment of MdHsfs 
to nine groups on the basis of the tissue and organ types 
in which MdHsfs were present (Table 4). 
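
The EST-counting step can be sketched as a simple tally. The records below are hypothetical placeholders, not data from the EST libraries; only the idea of counting ESTs per gene and tissue comes from the text, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def tissues_by_gene(est_hits):
    """Count ESTs per tissue for each gene from (gene, tissue) records."""
    counts = defaultdict(Counter)
    for gene, tissue in est_hits:
        counts[gene][tissue] += 1
    return counts


def expressed_tissues(counts, gene):
    """Return the sorted tissues in which at least one EST was found."""
    return sorted(tissue for tissue, n in counts[gene].items() if n > 0)


# Hypothetical EST assignments, for illustration only.
est_hits = [
    ("MdHsfA9b", "seed"),
    ("MdHsfA9b", "seed"),
    ("MdHsfA9a", "leaf"),
]
counts = tissues_by_gene(est_hits)
seed_specific = expressed_tissues(counts, "MdHsfA9b")
```

A gene would then be placed in a tissue group according to the set of tissues returned for it.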

Within subgroup A1, MdHsfA1a and MdHsfA1d were the
most widely represented, as their expression was detected in
leaf, flower, fruit, shoot and phloem. Similarly,
MdHsfB1a and MdHsfB1b of class B were expressed in
several apple tissues. Interestingly, MdHsfA9b was the
only Hsf specific for seed, whereas MdHsfA9a was found
in leaf. Furthermore, expression restricted to a single
tissue type was also observed for other members of
the MdHsf family: all A3-type MdHsfs were expressed in
shoot, and both members of class C were found in
root. In addition, the analysis of digital data showed that
duplicated genes located on different chromosomes
had identical expression patterns (e.g. MdHsfB4a and
MdHsfB4b, MdHsfC1a and MdHsfC1b).

Expression analysis of MdHsf genes in apple organs under 
natural environmental conditions 

Hsf genes are differentially expressed during flower and 
fruit development and are induced by abiotic environ- 
mental factors [7,12,13]. To investigate if MdHsfs are 
also involved in these processes, a comprehensive ana- 
lysis of their expression was performed in flowers and 
fruits from field-grown trees. Flowers were harvested at 
the stages of tight cluster, full pink and anthesis (FLS1, 
FLS2, FLS3) during spring at average temperatures of 
23°C/7°C (day/night; max/min), while the developing 
fruits were chosen at the stage of 10, 15 and 20 mm in 
Table 3 Motif sequences identified by MEME tools in apple Hsfs

Motif  Length  Best possible match
1      50      TDHIISWSDANNSFWWDPPEFARDLLPKYFKHNNFSSFIRQLN^GFRK
2      29      VDPDRWEFANEWFQRGQKHLLCNIHRRKH
3      50      LMQEIVRLRQQQQ\TQNQLHAMNQRLQGMECRQQQMMSFLAKAMHNPGFL
4      21      LH^GPPPFLCKTYDMVDDPA
5      29      YQQNPTGACVEVGKCGLWDEIERLKRDKN
6      26      CHKYMDGQIVKYQPPMNEAAKAMLRP
7      29      TPYTHPDIVNDIFWEQFWARPICGNIEE
8      50      FEQSPHYPSQmGKLGLDAESTAFQFVDAALDELAITQGFLETPEQEGE
9      28      MLMSELAHMKKKCNEIIYFVANYVCMAW
10     20      GWDKSQNMNHITEQMGHLTS
11     41      QTDWIPELTRIQGIVPEGNVDIPNANMIGEDIGNGFYMGM
12     41      EFEAFCSVNPLGAFDREKVSIPTSSMGGGGAEDVWPPQP
13     20      HVHKNEKNRRITGYNKKRRL
14     49      IINPDAMLITKAPTGATNTRNSSQPGYG\TNGGGGHISCEVNYPTESTP
15     50      MSPKDESHPKSPPTSAEFDPESIGLSEFRPQVSAPLLGSQPIPSFTSPVM
16     50      PSNSYPSSMLLCNPQPPKHNGPNGNLNQLQGYYPAAPPPNAKQNPHHIMN
17     49      MIKQEDIWSMGFGVSAGMSTSMHELWGNPVNYDVPEMG\^GGLLDVWDI
18     26      AQPHQVGLNHHHHHHSPLGMNGHHHH
19     50      DGFIDPTSEVMNGSLPIDFDDISSDIEAFLKDWDDIIQNPGADEMDSTCA
20     14      EEECKNLKLFGVWL
21     15      VRRKFVKHQQHELSK
22     45      QQLMQKRMIKRELDGGDLGKRRRLPPAQGIESFDEWINDSLSFDC
23     50      FHQDFSSKLRLELSPAVSDMNLVSRSTQSSNEDGGSSTRKISEELKGAQM
24     50      GASSM\/TEDPFFKGKSVLSPQQEANPERYVSFQEDLVKDRTFPELFSPGM
25     41      NSGSEKQPEVDAYMDGMEDFWNPDFMKMLMDEKLSPVENH
26     41      FFPFPSRGSISPSDSDEQPNWCDSDSPPLLSPTGGINTNIN
27     40      PRMIQEIDYSAAAELGEKAKMVMMIAFTSSTAADDD^
28     11      THVHDHQQQPP
29     41      ISSSPEAGFEMESFNRYPTPPEVQTASDWLRQRWFVDRVRA
30     50      CVSGVILQbVPLISGHGLPSVIbbIHSPPRVANPGIVMRSPhSDVNALVG

Numbers in the first column indicate the motifs represented in Figure 2.




diameter (FUS1, FUS2, FUS3) and harvest at average
temperatures of 23°C/14°C (day/night; max/min). Quantitative
real-time PCR was used as the approach to
monitor gene expression changes, and MdHsf transcript

Similar to the A1 subgroup, higher messenger
levels at anthesis were also observed for other members
of class A such as MdHsfA2a-b, MdHsfA3b-c,
MdHsfA[illegible]a-b and MdHsfA9a. Interestingly, MdHsfA9b
Figure 3 Neighbor-joining phylogeny of Hsfs from M. domestica, P. trichocarpa and A. thaliana. The phylogenetic tree was obtained using
the MEGA 5.0 software on the basis of amino-acid sequences of the N-terminal domains of MdHsfs including the DNA-binding domain, the HR- 
A/B region and the linker between both regions. The final dataset included a total of 281 positions. Evolutionary distances were computed using 
the Jones-Taylor-Thornton matrix-based method and by removing all ambiguous positions for each sequence pair. Numbers indicate bootstrap 
values >80 based on 1000 replicates for the major nodes. The abbreviations of species names are as follows: Md, Malus domestica; Pt, Populus 
trichocarpa; At, Arabidopsis thaliana. 



lower expression in flower/fruit than in leaf. Low transcript
abundances in fruit as compared to flower or leaf
were also observed for MdHsfC1a-b.

To further characterize the expression of Hsf family
genes in apple, the quantitative real-time PCR analysis was
extended to leaf samples harvested from field-grown trees
exposed to naturally increased temperatures. Leaf samples
were taken during the summer period at two different
temperature ranges: at 26°C/12°C (day/night; max/min) on
30th July 2011, which was used as the reference, and at a high
average temperature of 32°C/17°C (day/night; max/min) on
21st August 2011 (Additional file 1: Figure S1).

The transcriptional analyses revealed that in leaf most
of the MdHsf genes were responsive to the increased
temperatures (Figure 5). Twelve of these responsive
genes showed transcript accumulation significantly
higher than the reference sample, while only MdHsfA9b
and MdHsfB4a-b were strongly down-regulated in response
to the increased temperatures. A 4-fold or higher increase
of expression levels in response to high temperatures
was observed for MdHsfA2a-b, MdHsfA3b-c,
MdHsfB1a, MdHsfB2a, MdHsfB3a-b and MdHsfC1a-b,
whereas expression was only slightly higher in the stressed
leaves than in the reference ones for MdHsfA4a, MdHsfA5a-b and
MdHsfA8a-b. Furthermore, none of the subgroup A1 members
(MdHsfA1a-d) showed any significant transcriptional
changes in response to the high temperatures
compared to the reference conditions.
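
To make the fold-change figures above concrete, the following sketch shows one common way of computing relative expression from qRT-PCR data, the 2^-ΔΔCt method with several reference genes. The text does not state which calculation was used, so this method is an assumption; the Ct values are hypothetical, and the code assumes 100% amplification efficiency. Averaging the reference-gene Ct values corresponds to taking the geometric mean of their expression levels.

```python
from statistics import mean


def fold_change(target_ct_stress, reference_cts_stress,
                target_ct_control, reference_cts_control):
    """Relative expression by 2^-ddCt (assumed method, 100% efficiency)."""
    # Normalize the target gene to the mean Ct of the reference genes.
    delta_stress = target_ct_stress - mean(reference_cts_stress)
    delta_control = target_ct_control - mean(reference_cts_control)
    # Compare the stressed sample with the reference (control) sample.
    return 2.0 ** -(delta_stress - delta_control)


# Hypothetical Ct values for one target gene and three reference genes.
example = fold_change(24.0, [20.0, 21.0, 22.0], 26.0, [20.0, 21.0, 22.0])
```

Here the target gene crosses the threshold two cycles earlier in the stressed sample while the reference genes are unchanged, which corresponds to a 4-fold increase.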

Discussion 

In plants, members of the Hsf family have been described 
as key regulators in molecular and cellular responses to 
stress conditions [1,7]. Furthermore, data from tomato 
and Arabidopsis have shown that the Hsfs are important 
components involved in developmental signalling [13,14]. 
Both size and composition of the Hsf family have been 
Table 4 Digital expression of MdHsf genes

Tissue and organ type (DFCI Apple Gene Index)

Gene name   Leaf   Root   Flower   Fruit   Shoot   Phloem   Xylem   Seed   Bud
MdHsfA1a
MdHsfA1b
MdHsfA1c
MdHsfA1d
MdHsfA2a
MdHsfA2b
MdHsfA3a
MdHsfA3b
MdHsfA3c
MdHsfA4a
MdHsfA5a
MdHsfA5b
MdHsfA8a
MdHsfA8b
MdHsfA9a
MdHsfA9b
MdHsfB1a
MdHsfB1b
MdHsfB2a
MdHsfB3a
MdHsfB3b
MdHsfB4a
MdHsfB4b
MdHsfC1a
MdHsfC1b

+: Expressed; blank: not expressed.


analyzed and characterized in different plant species [1].
The present study investigates this gene family for the
first time in the economically relevant domesticated apple
and shows that its genome contains 25 full-length Hsf
genes. This number is similar to that of Populus trichocarpa,
for which 28 loci encoding Hsf proteins were found
[10]. Velasco et al. [11] have shown that genome-wide
duplications occurred in apple, causing the expansion
of several gene classes. Indeed, it was found that the
enlargement of the MdHsf family originated mainly
from segmental duplications between different chromosomes.
The situation is similar in maize and in Populus, in
which segmental Hsf gene duplications were more prevalent
than tandem duplications [9,10]. Gene duplications
have an important role not only in genomic
rearrangement and expansion but also in the diversification of
gene function. In particular, genes encoding nucleic acid
binding proteins, including transcription factors, originated
mostly by segmental duplication. In contrast, membrane
proteins and proteins involved in the stress response
are encoded by genes mainly duplicated in tandem [18,19].
Therefore, the prevalence of segmental duplication events
in MdHsf expansion may be associated with the fact that these
genes act as transcriptional regulators.

Malus, Arabidopsis and Populus belong to the Rosid
lineage and they are grouped in two distinct clades,
namely Fabids (Malus and Populus) and Malvids
(Arabidopsis) [20]. It was observed in the present study
that the majority of the MdHsfs had a closer phylogenetic
relationship to the PtHsfs than to the AtHsfs. This may be
attributable to the fact that Malus and Populus belong to
the same Fabids clade and that, as both are trees, they may have
adapted to prolonged and repeated environmental constraints,
unlike Arabidopsis.

Functional diversification of multifamily duplicated
genes has been observed in trees. For example, the family
of the glutathione S-transferases in Populus shows a clear
divergence in expression patterns in response to different
stress treatments [21]. Therefore, the presence of
many duplicated Hsf genes in the apple genome may be
Figure 4 Expression analyses of MdHsfs in developing flowers and fruits. Quantification of messenger RNA levels was performed in
developing flowers corresponding to the tight cluster, full pink and full bloom stages (FLS1, FLS2, FLS3) and in developing fruits of 10, 15 and 20
mm in diameter (FUS1, FUS2, FUS3). The relative expression of MdHsf genes at the different flower and fruit stages was calculated in relation to young
leaves of 3-5 cm in length. The qRT-PCR analysis results were normalized using EF1, Tip-41 and IMPA9 as housekeeping genes. Each bar represents
the average of the relative expression levels from three biological replicates.



related to the fact that a sub-functionalization has taken 
place especially to cope with prolonged and specific 
stress conditions. 

MdHsf genes were found to be expressed in several
apple tissues. In particular, members belonging to the
A1 and B1 subclasses, such as MdHsfA1a, MdHsfA1d,
MdHsfB1a and MdHsfB1b, were constitutively expressed in
different tissues. A similar situation was found in other
plants like Arabidopsis, where A1-type Hsfs were
involved in housekeeping processes under normal conditions,
being ready for the fast activation of other Hsf
genes following stress treatment [22,23]. Furthermore,
expression data from flower and fruit tissues indicated
that some duplicated gene pairs, e.g. MdHsfA9a and
MdHsfA9b, exhibited differences in their expression
levels. This suggests that they may be subject to
different regulation in apple tissues [1,7].



In contrast, the expression of MdHsfA2a and MdHsfA2b
was mainly detected in full bloom flowers. AtHsfA9 and
LeHsfA2a (Le, Lycopersicon esculentum) were found to be
expressed in seed and developing pollen grains [13,14,24].
It was shown that the presence of these Hsfs during plant
development is important for heat shock protein activation.
This suggests that MdHsfA2a and MdHsfA2b may
be important during pollination and fertilization, which
occur at anthesis.

Effects of heat stress (HS) on Hsf gene expression
have been examined in several plant species, but no
data are available about Hsf expression in trees
exposed to naturally increased temperatures. Under
laboratory settings, it was shown that AtHsfA1a and
AtHsfA1b regulate the early response to HS in Arabidopsis
[22,25]. AtHsfA2 is rapidly induced by HS, and it is
involved in enhancing and maintaining the HS response
Figure 5 Expression analyses of MdHsfs in vegetative leaf tissue under heat stress. Messenger RNA levels of MdHsfs were analyzed by
qRT-PCR in leaf samples from trees grown under field conditions and exposed to different temperature ranges: reference temperature, average of
26°C/12°C (day/night; max/min), considered as normal conditions; and high temperature, average of 32°C/17°C (day/night; max/min), considered
as stress conditions. The qRT-PCR analysis results were normalized using EF1, Tip-41 and IMPA9 as housekeeping genes. Each bar represents the
average of the relative expression levels from three biological replicates.
when plants are exposed to prolonged or repeated
cycles of HS [26,27]. Similarly to AtHsfA2, AtHsfA3
is involved in thermo-tolerance mechanisms [7,28,29].
The A1-type MdHsfs were expressed at the same level
in leaves from plants growing in the field and exposed
to different temperature conditions. In contrast, MdHsfA2a-b
and MdHsfA3b-c were strongly induced. This
suggests that these MdHsfs could be involved
in maintaining the stress response when apple trees
are exposed to prolonged periods of high temperature.

In contrast to class A Hsfs, genes assigned to the B
and C classes have so far not been fully characterized.
Members of the B class were shown to act mainly as
repressors of the expression of HS-inducible genes
[30,31]. Some of them form a complex with A-type Hsfs
to maintain housekeeping gene expression during HS
regimes [32]. Therefore, the strong transcriptional activation
observed in apple may indicate that some of them
also have a role in the response to high temperatures
in this species. For the majority of MdHsfs, increased
messenger RNA levels were observed under naturally
increased temperatures. However, MdHsfA9b and
MdHsfB4a-b were the only Hsf genes showing low transcript
abundance. Although proteomic data are not
available for all MdHsf genes, their activation or repression
may suggest that these transcripts occupy a
high position in the hierarchy of molecular events induced by
high temperatures.

Conclusions 

The complexity of the Hsf family has been the object of
many investigations in different plant species. Here, 25
full-length Hsf genes were identified in the apple genome.
Based on structural characteristics of the proteins
and on the comparison with homologues from other
species, the 25 MdHsfs were grouped into three different
classes. Segmental and tandem duplications were examined
and found to have contributed to the expansion of the Hsf family
in the apple genome. The expression profiles in flowers and
fruits at different developmental stages, as well as in
leaves exposed to naturally increased temperature, indicated
that MdHsfs may play a role in different aspects of
apple growth and development.

Malus domestica represents an economically important 
woody plant whose genome has been fully sequenced and 
whose commercial value is due to fruit production in the 
field. Therefore, understanding the role of protective genes 
such as the Hsfs during development and under stress 
conditions is important. The results of this research should 
be useful for future gene cloning and functional studies 
and, in turn, for producing apple cultivars with improved 
genetic traits. 



Methods 

Identification and classification of Hsfs in Malus domestica 

The recently sequenced apple genome was searched 
for putative genes encoding MdHsfs (Md: Malus 
domestica) using BLASTN and BLASTP in the NCBI 
and TIGR-Apple databases [11,15]. Physical localization 
of all candidate MdHsfs was analyzed in order to reject 
redundant sequences with the same chromosome loca- 
tion. In order to identify signature domains, the MdHsf 
sequences were compared to the Hsf proteins of Arabi- 
dopsis and tomato by amino acid sequence alignment 
using ClustalW (version 1.83). The presence of DBDs 
and coiled-coil structures was checked with the SMART and 
MARCOIL programs [33,34]. In addition, identification 
of putative domain motifs in the full-length amino acid 
sequences of the MdHsfs was also performed by MEME 
tools [35]. Visualization of the MEME motifs in the 
MdHsfs was performed by using Expasy tools 
(http://prosite.expasy.org/mydomains). MdHsf names were 
assigned on the basis of the original nomenclature as 
worked out for the Arabidopsis thaliana Hsf family, and 
later applied to other plant Hsf families [2,7]. Classifica- 
tion into three different groups A, B and C was based 
on the information of oligomerization domains [2]. 

Phylogenetic analysis and gene duplication of MdHsfs 

Gene duplications in the apple genome were analyzed by 
testing the similarity of all MdHsf genes using ClustalW. 
A gene duplication was defined according to the follow- 
ing criteria: (1) the length of the sequence alignment 
covered > 80 % of the longest gene, and (2) the similarity 
of the aligned gene regions was > 80 % [36,37]. Data 
were then plotted using Circos software [38]. 
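
The two duplication criteria above can be expressed as a simple filter 
over pairwise alignment results. The following is a minimal sketch 
only, not the procedure actually used in this study: the 
PairwiseAlignment record, its field names and the is_duplicate 
function are hypothetical, and the alignment length and similarity 
values are assumed to have been extracted beforehand from the 
ClustalW output. 

```python
from dataclasses import dataclass


@dataclass
class PairwiseAlignment:
    """Hypothetical record summarising one pairwise gene alignment."""
    gene_a: str
    gene_b: str
    len_a: int             # length of gene A (bp)
    len_b: int             # length of gene B (bp)
    aligned_length: int    # length of the aligned region (bp)
    similarity: float      # similarity of the aligned region (%)


def is_duplicate(aln: PairwiseAlignment,
                 min_coverage: float = 80.0,
                 min_similarity: float = 80.0) -> bool:
    """Apply the two criteria: (1) the alignment covers more than 80 %
    of the longest gene and (2) the aligned regions are more than
    80 % similar."""
    longest = max(aln.len_a, aln.len_b)
    coverage = 100.0 * aln.aligned_length / longest
    return coverage > min_coverage and aln.similarity > min_similarity


if __name__ == "__main__":
    # Placeholder gene names and values, for illustration only.
    pair = PairwiseAlignment("geneA", "geneB", 1500, 1400, 1350, 92.0)
    print(pair.gene_a, pair.gene_b, is_duplicate(pair))
```

Both thresholds are strict inequalities, matching the "> 80 %" 
wording of the criteria, and coverage is computed against the longer 
of the two genes. 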

To understand the evolutionary relationships of the 
MdHsf proteins, a phylogenetic tree was constructed. 
The N-terminal Hsf protein sequences containing the 
DBD and HR-A/B regions from Malus domestica, Arabi- 
dopsis thaliana and Populus trichocarpa [7,10] were 
aligned using ClustalW. A phylogenetic tree was con- 
structed using the Neighbor Joining (NJ) method in 
MEGA (version 5.0) [39]. Based on the results of the 
model selection analysis, the Jones-Taylor-Thornton 
matrix-based method was used to compute evolutionary 
distances [40]. The rate variation among sites was mod- 
eled with a gamma distribution (shape parameter = 
0.67). Bootstrap analysis was conducted with 1000 repli- 
cates to assess statistical support for each node. 

Digital and EST expression analysis 

MdHsf expression profiles were investigated at the 
transcriptional level. MdHsf expression patterns were 
searched with the BLAST program in the TIGR-Apple 
EST libraries [17] using the following parameters: 
maximum identity > 95%, length > 200 bp and 
E-value < 10⁻¹⁰. 
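
As an illustration of how these three thresholds could be applied to 
a list of BLAST hits, the sketch below filters tab-separated hit 
lines. It assumes the standard 12-column tabular layout produced by 
NCBI BLAST+ (-outfmt 6), which is not necessarily the output format 
of the TIGR-Apple search used here; the function name 
filter_est_hits and the example identifiers are hypothetical. 

```python
def filter_est_hits(lines, min_identity=95.0, min_length=200,
                    max_evalue=1e-10):
    """Keep hits with identity > 95 %, alignment length > 200 bp and
    E-value < 1e-10.

    Each line is assumed to follow the BLAST+ tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pident = float(fields[2])
        length = int(fields[3])
        evalue = float(fields[10])
        if (pident > min_identity and length > min_length
                and evalue < max_evalue):
            hits.append((fields[0], fields[1], pident, length, evalue))
    return hits


if __name__ == "__main__":
    # Placeholder query and EST identifiers, for illustration only.
    example = [
        "query1\tEST_1\t98.5\t350\t5\t0\t1\t350\t1\t350\t1e-50\t600",
        "query1\tEST_2\t94.0\t400\t24\t0\t1\t400\t1\t400\t1e-60\t650",
    ]
    for hit in filter_est_hits(example):
        print(hit)
```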

Plant material 

Experiments were carried out in 2011 on 18-year-old 
apple trees (cultivar 'Golden Delicious' on M9 rootstock) 
trained with standard horticultural practices at the ex- 
perimental farm of the Research Centre for Agriculture 
and Forestry Laimburg (South Tyrol, Italy). Samples 
were taken from 24 homogeneous trees grouped in 3 
biological replicates each containing 8 trees distributed 
in the same block of the orchard. Tissue samples were 
collected between April and August 2011 from trees 
grown under field environmental conditions and exposed 
to natural variations of temperature and solar radiation. 
Temperature data are reported in Additional file 1. 
Young leaves (3-5 cm in length) and developing flowers 
at the tight cluster (FLS1), pink (FLS2) and full bloom 
(anthesis, FLS3) stages were harvested in spring, when 
average maximum/minimum (day/night) temperatures were 
23°C/7°C. Developing fruits 10 mm (FUS1), 15 mm (FUS2) 
and 20 mm (FUS3) in length were collected from the same 
trees at average maximum/minimum temperatures of 
23°C/14°C. To test Hsf gene expression under naturally 
increased temperatures, leaf samples were taken in 
summer at two temperature ranges: 26°C/12°C (day/night; 
max/min) on 30 July 2011, used as the reference, and a 
high-temperature average of 32°C/17°C on 21 August 2011 
(Additional file 1: Figure S1). All samples used in gene 
expression analyses were harvested at midday (12:00 noon) 
from positions about 1.60 m above the soil. 

RNA isolation and quantitative real-time PCR (qRT-PCR) 
analyses 

Total RNA was isolated from apple tissues with the hot 
phenol method [41]. RNA quantity was measured using 
a NanoDrop ND-1000 spectrophotometer, and its quality 
was checked by agarose gel electrophoresis. For reverse 
transcription, total RNA was incubated with 
RNase-free DNase (RQ1; Promega, Madison, WI), and 1 
µg was used for reverse transcription according to the 
manufacturer's instructions (Superscript Vilo cDNA 
Synthesis kit; Invitrogen). 

The qRT-PCR analyses were carried out on a 7500 
Fast Real-time PCR System (Applied Biosystems) with 
the ROX Reference Dye. Each reaction contained 12.5 µl 
SYBR GreenER qPCR SuperMix Universal (Invitrogen), 
20 ng of cDNA and 400 nM of each specific primer. The 
qRT-PCRs were performed using a controlled temperature 
program starting with 10 min at 95°C, followed by 40 
cycles of 15 s at 95°C and 60 s at 60°C. To verify the 
presence of a specific product, the melting temperature of the 
amplified products was determined. In addition, each PCR 
mixture was analyzed on an ethidium bromide-stained 2% 
agarose gel to verify the size of the amplified DNA 
fragment. The primers used for the qRT-PCRs were designed 
using Quantprime software and are reported in 
Additional file 2 [42]. The qRT-PCRs were performed 
in technical duplicates and repeated on three 
independent biological replicates. Relative mRNA levels of 
the target genes were calculated according to Vandesompele 
et al. (2002) [43]. The genes encoding elongation 
factor 1 alpha subunit (eF-1 alpha; accession number 
AJ223969.1), Importin alpha Isoform9 (IMPA-9; accession 
number CN909679) and Tip-41 like protein 
(Tip-41; accession number CN941833) were used as 
references in the qRT-PCR analyses. 
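
The normalization approach of Vandesompele et al. [43] divides the 
relative quantity of a target gene by the geometric mean of the 
relative quantities of several reference genes. The sketch below 
illustrates that calculation for three reference genes. It is not the 
authors' actual computation: the function names and Ct values are 
hypothetical, and an amplification efficiency of 2 (perfect doubling 
per cycle) is assumed because measured primer efficiencies are not 
given here. 

```python
import math


def relative_quantity(ct_sample, ct_calibrator, efficiency=2.0):
    """Quantity of a gene in a sample relative to a calibrator sample,
    computed from Ct values. efficiency=2.0 is an assumption."""
    return efficiency ** (ct_calibrator - ct_sample)


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalized_expression(target_ct, target_cal_ct,
                          ref_cts, ref_cal_cts, efficiency=2.0):
    """Target relative quantity divided by a normalization factor,
    the geometric mean of the reference-gene relative quantities."""
    target_rq = relative_quantity(target_ct, target_cal_ct, efficiency)
    ref_rqs = [relative_quantity(s, c, efficiency)
               for s, c in zip(ref_cts, ref_cal_cts)]
    return target_rq / geometric_mean(ref_rqs)


if __name__ == "__main__":
    # Hypothetical Ct values: one target gene and three reference
    # genes, measured in a treated sample and in a reference sample.
    value = normalized_expression(
        target_ct=22.0, target_cal_ct=25.0,
        ref_cts=[19.0, 21.0, 23.0], ref_cal_cts=[20.0, 22.0, 24.0])
    print(round(value, 3))
```

Using the geometric rather than the arithmetic mean limits the 
influence of a single reference gene with outlying abundance. 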

Additional files 



Additional file 1: Figure S1. Average temperature in the orchard 
where apple trees were sampled during the 2011 growing season. The 
data show temperature ranges obtained from a meteorological station 
located in the apple orchard and positioned around 2 m in height. Each 
point represents the average calculated on the basis of data from seven 
days. 

Additional file 2: Primer sequences used for quantitative real-time 
PCR analyses. 